INRIA a CCSD electronic archive server

HAL Descartes

Use of the score test as a goodness-of-fit measure of the covariance structure in genetic analysis of longitudinal data

Author: Jaffrézic Florence
Thompson Robin
White Ian MS
Publication venue: BioMed Central
Publication date: 01/01/2003

Model selection is an essential issue in longitudinal data analysis since many different models have been proposed to fit the covariance structure. The likelihood criterion is commonly used and allows the fit of alternative models to be compared. Its value does not reflect, however, the potential improvement that can still be reached in fitting the data unless a reference model with the actual covariance structure is available. The score test approach does not require knowledge of a reference model, and the score statistic has a meaningful interpretation in itself as a goodness-of-fit measure. The aim of this paper was to show how the score statistic may be separated into genetic and environmental parts, which is difficult with the likelihood criterion, and how it can be used to check parametric assumptions made on variance and correlation parameters. Selection of models for genetic analysis was applied to a dairy cattle example for milk production.
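
For orientation, the general textbook form of a score (Rao) statistic is sketched below. The notation is assumed for illustration and is not taken from the paper, whose exact formulation and genetic/environmental decomposition may differ: $\ell$ is the log-likelihood, $\theta$ the vector of covariance parameters, $\hat{\theta}_0$ the estimate under the fitted (null) model, and $\mathcal{I}$ the information matrix.

```latex
% Score vector: gradient of the log-likelihood
U(\theta) = \frac{\partial \ell(\theta)}{\partial \theta}

% Score statistic evaluated at the fitted (null) model
S = U(\hat{\theta}_0)^{\top} \, \mathcal{I}(\hat{\theta}_0)^{-1} \, U(\hat{\theta}_0)

% Under the null model, S is asymptotically chi-square distributed,
% with degrees of freedom equal to the number of constrained parameters.
```

Only the null model has to be fitted to compute $S$, which is why no reference model is needed.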

Rothamsted Repository

Estimation of genetic parameters for test day records of dairy traits in the first three lactations

Author: Druet Tom
Ducrocq Vincent
Jaffrézic Florence
Publication venue: BioMed Central
Publication date: 01/01/2005

Application of test-day models for the genetic evaluation of dairy populations requires the solution of large mixed model equations. The size of the (co)variance matrices required with such models can be reduced through the use of their first eigenvectors. Here, the first two eigenvectors of (co)variance matrices estimated for dairy traits in first lactation were used as covariables to jointly estimate genetic parameters of the first three lactations. These eigenvectors appear to be similar across traits and have a biological interpretation, one being related to the level of production and the other to persistency. Furthermore, they explain more than 95% of the total genetic variation. Variances and heritabilities obtained with this model were consistent with previous studies. High correlations were found among production levels in different lactations. Persistency measures were less correlated. Genetic correlations between second and third lactations were close to one, indicating that these can be considered as the same trait. Genetic correlations within lactation were high except between extreme parts of the lactation. This study shows that the use of eigenvectors can reduce the rank of (co)variance matrices for the test-day model and can provide consistent genetic parameters.
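
The rank-reduction idea can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the function name `reduce_rank`, the 10 test days, and the toy covariance matrix (a level component plus a linear persistency component) are hypothetical and are not the data or code of the study.

```python
import numpy as np

def reduce_rank(cov, k):
    """Rank-k approximation of a symmetric covariance matrix.

    Returns the approximation, the k leading eigenvectors, and the
    share of total variance (trace) carried by those eigenvectors.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    top_vals, top_vecs = eigvals[:k], eigvecs[:, :k]
    approx = (top_vecs * top_vals) @ top_vecs.T  # V diag(lambda) V^T
    explained = top_vals.sum() / eigvals.sum()
    return approx, top_vecs, explained

# Hypothetical covariance across 10 test days: level + linear trend, plus noise
t = np.linspace(-1.0, 1.0, 10)
basis = np.column_stack([np.ones_like(t), t])
K = np.array([[4.0, 0.5],
              [0.5, 1.0]])
cov = basis @ K @ basis.T + 0.05 * np.eye(10)

approx, vecs, explained = reduce_rank(cov, k=2)
print(f"share of variance kept by 2 eigenvectors: {explained:.3f}")
```

In this toy case the two retained eigenvectors play the role of the covariables described above; the 10 x 10 matrix is then summarised by a 2 x 2 matrix of coefficients.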

Open Repository and Bibliography - Liège

EM-REML estimation of covariance parameters in Gaussian mixed models for longitudinal data analysis

Author: Foulley Jean-Louis
Jaffrézic Florence
Robert-Granié Christèle
Publication venue: BioMed Central
Publication date: 01/01/2000

This paper presents procedures for implementing the EM algorithm to compute REML estimates of variance-covariance components in Gaussian mixed models for longitudinal data analysis. The class of models considered includes random coefficient factors, stationary time processes and measurement errors. The EM algorithm allows separation of the computations pertaining to parameters involved in the random coefficient factors from those pertaining to the time processes and errors. The procedures are illustrated with Potthoff and Roy's data example on growth measurements taken on 11 girls and 16 boys at four ages. Several variants and extensions are discussed.

A quasi-score approach to the analysis of ordered categorical data via a mixed heteroskedastic threshold model

Author: Foulley Jean-Louis
Jaffrézic Florence
Robert-Granié Christèle
Publication venue: BioMed Central
Publication date: 01/01/1999

This article presents an extension of the methodology developed by Gilmour et al. [19] for ordered categorical data, taking into account the heterogeneity of residual variances of latent variables. Heterogeneity of residual variances is described via a structural linear model on log-variances. This method involves two main steps: i) a "marginalization" with respect to the random effects, leading to quasi-score estimators; ii) an approximation of the variance-covariance matrix of the observations, which leads to an analogue of the Henderson mixed model equations for continuous Gaussian data. This methodology is illustrated by a numerical example of foot shape in sheep.
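
A structural linear model on log-variances is commonly written as below. This is a sketch in assumed notation, not a formula quoted from the article: $\sigma_i^2$ is the residual variance of the latent variable in subclass $i$, $\mathbf{p}_i$ a vector of known covariates, and $\boldsymbol{\delta}$ the vector of dispersion parameters.

```latex
\ln \sigma_i^2 = \mathbf{p}_i^{\top} \boldsymbol{\delta}
\quad\Longleftrightarrow\quad
\sigma_i^2 = \exp\!\left( \mathbf{p}_i^{\top} \boldsymbol{\delta} \right)
```

Working on the log scale keeps every $\sigma_i^2$ positive without constraining $\boldsymbol{\delta}$.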

ProdInra

Genetic analysis of growth curves using the SAEM algorithm

Author: Foulley Jean-Louis
Jaffrézic Florence
Lavielle Marc
Meza Cristian
Publication venue: BioMed Central
Publication date: 01/01/2006

The analysis of nonlinear function-valued characters is very important in genetic studies, especially for growth traits of agricultural and laboratory species. Inference in nonlinear mixed effects models is, however, quite complex and is usually based on likelihood approximations or Bayesian methods. The aim of this paper was to present an efficient stochastic EM procedure, namely the SAEM algorithm, which converges much faster than the classical Monte Carlo EM algorithm and Bayesian estimation procedures, does not require specification of prior distributions and is quite robust to the choice of starting values. The key idea is to recycle the simulated values from one iteration to the next in the EM algorithm, which considerably accelerates the convergence. A simulation study is presented which confirms the advantages of this estimation procedure in the case of a genetic analysis. The SAEM algorithm was applied to real data sets on growth measurements in beef cattle and in chickens. The proposed estimation procedure, like the classical Monte Carlo EM algorithm, provides significance tests on the parameters and likelihood-based model comparison criteria to compare the nonlinear models with other longitudinal methods.
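
The recycling of simulated values can be written compactly. The following is the usual textbook form of one SAEM iteration, given as a sketch; the notation is assumed here and does not come from the abstract: $y$ denotes the observations, $\phi$ the unobserved random effects, $\theta$ the parameters, and $\gamma_k$ a sequence of step sizes in $(0, 1]$ that decreases to zero.

```latex
% Simulation step: draw the random effects given the current parameters
\phi^{(k)} \sim p\left(\phi \mid y; \theta_{k-1}\right)

% Stochastic approximation step: blend the new draw with the previous Q
Q_k(\theta) = Q_{k-1}(\theta)
  + \gamma_k \left( \log p\left(y, \phi^{(k)}; \theta\right) - Q_{k-1}(\theta) \right)

% Maximisation step
\theta_k = \arg\max_{\theta} Q_k(\theta)
```

Because $Q_{k-1}$ is carried forward rather than recomputed, draws from earlier iterations keep contributing, so a single draw per iteration can suffice where Monte Carlo EM needs many.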

Detection and modelling of time-dependent QTL in animal populations

Author: Florence Jaffrézic
Mogens S. Lund
Per Madsen
Peter Sorensen
Publication venue: EDP Sciences
Publication date: 01/01/2008

A longitudinal approach is proposed to map QTL affecting function-valued traits and to estimate their effect over time. The method is based on fitting mixed random regression models. The QTL allelic effects are modelled with random coefficient parametric curves and using a gametic relationship matrix. A simulation study was conducted to assess the ability of the approach to fit different patterns of QTL over time. It was found that this longitudinal approach was able to adequately fit the simulated variance functions and considerably improved the power of detection of time-varying QTL effects compared to the traditional univariate model. This was confirmed by an analysis of protein yield data in dairy cattle, where the model was able to detect QTL with high effect either at the beginning or the end of the lactation that were not detected with a simple 305-day model.

arXiv.org e-Print Archive

ProdInra

Reverse engineering gene regulatory networks using approximate Bayesian computation

Author: Doerge R. W.
Foulley Jean-Louis
Jaffrézic Florence
Rau Andrea
Publication date: 01/01/2011

Gene regulatory networks are collections of genes that interact with one another and with other substances in the cell. By measuring gene expression over time using high-throughput technologies, it may be possible to reverse engineer, or infer, the structure of the gene network involved in a particular cellular process. These gene expression data typically have a high dimensionality and a limited number of biological replicates and time points. Due to these issues and the complexity of biological systems, the problem of reverse engineering networks from gene expression data demands a specialized suite of statistical tools and methodologies. We propose a non-standard adaptation of a simulation-based approach known as Approximate Bayesian Computing based on Markov chain Monte Carlo sampling. This approach is particularly well suited for the inference of gene regulatory networks from longitudinal data. The performance of this approach is investigated via simulations and using longitudinal expression data from a genetic repair system in Escherichia coli.
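
To make the simulation-based idea concrete, here is a minimal sketch of a generic ABC-MCMC sampler on a toy problem (inferring the mean of a normal distribution). It is not the authors' adaptation for gene networks: the function name `abc_mcmc`, the summary statistic (sample mean), the flat prior on [-10, 10], and the tolerance `eps` are all assumptions chosen for illustration.

```python
import numpy as np

def abc_mcmc(observed, n_iter=5000, eps=0.1, step=0.5, seed=0):
    """Toy ABC-MCMC for the mean of a unit-variance normal model.

    A proposal is accepted only if data simulated from it give a
    summary statistic within eps of the observed one; no likelihood
    is ever evaluated.
    """
    rng = np.random.default_rng(seed)
    obs_stat = observed.mean()
    n = observed.size
    theta = obs_stat                      # start at a plausible value
    chain = np.empty(n_iter)
    for i in range(n_iter):
        proposal = theta + step * rng.normal()           # symmetric random walk
        simulated = rng.normal(proposal, 1.0, size=n)    # simulate from the model
        close = abs(simulated.mean() - obs_stat) <= eps  # distance check
        # Flat prior + symmetric proposal: the Metropolis-Hastings ratio
        # reduces to checking that the proposal lies in the prior support.
        if close and -10.0 <= proposal <= 10.0:
            theta = proposal
        chain[i] = theta
    return chain

# Hypothetical data: 50 observations with true mean 2.0
data_rng = np.random.default_rng(1)
observed = data_rng.normal(2.0, 1.0, size=50)
chain = abc_mcmc(observed)
print("approximate posterior mean:", chain[1000:].mean())
```

For a network, the scalar `theta` would be replaced by the network structure and parameters, and the simulator by a model of expression over time; those choices are where a problem-specific adaptation comes in.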

CiteSeerX

INRIA a CCSD electronic archive server

A quasi-score approach to the analysis of ordered categorical data via a mixed heteroskedastic threshold model

Author: Christèle Robert-Granié
Florence Jaffrézic
Jean-Louis Foulley
Publication venue: EDP Sciences
Publication date: 01/01/2007